Principal component analysis (PCA) is used here as a dimensionality reduction technique to find the overarching dimensions that represent knowledge about social relationships. In this study, we explore the components that emerge when a comprehensive list of social relationships is rated on a comprehensive list of dimensions drawn from the previous literature on social relationship knowledge.
This dataset was collected from a survey hosted on Amazon Mechanical Turk (MTurk). The survey data were cleaned with a separate Python script. A matrix was then created containing the average rating of each social relationship on each dimension thought to characterize these relationships. The relationship list was generated with lexical word-vector tools and is intended to cover the full range of social relationships (159 in total). The dimensions comprise all of those previously proposed in the literature.
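The averaging step can be sketched as follows. This is a minimal Python/NumPy illustration, not the cleaning script used for the study; the long-format layout, the example rows, and the function name `mean_rating_matrix` are all assumptions made for the example.

```python
import numpy as np

# Hypothetical long-format survey rows: (relationship, dimension, rating).
ratings = [
    ("siblings", "Intimacy", 6), ("siblings", "Intimacy", 4),
    ("siblings", "Formality", 2),
    ("boss-employee", "Intimacy", 2),
    ("boss-employee", "Formality", 6), ("boss-employee", "Formality", 7),
]

def mean_rating_matrix(rows):
    """Average ratings into a relationships x dimensions matrix."""
    relationships = sorted({r for r, _, _ in rows})
    dimensions = sorted({d for _, d, _ in rows})
    r_idx = {r: i for i, r in enumerate(relationships)}
    d_idx = {d: j for j, d in enumerate(dimensions)}
    totals = np.zeros((len(relationships), len(dimensions)))
    counts = np.zeros_like(totals)
    for r, d, value in rows:
        totals[r_idx[r], d_idx[d]] += value
        counts[r_idx[r], d_idx[d]] += 1
    # Cells with no ratings become NaN instead of raising a warning.
    with np.errstate(invalid="ignore", divide="ignore"):
        means = totals / counts
    return relationships, dimensions, means

relationships, dimensions, X = mean_rating_matrix(ratings)
```

Rows of `X` are relationships and columns are dimensions, which is the orientation assumed by the later sketches.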
PCA will output the same number of components as there are dimension inputs. As the components are ranked by how much variance they explain, we can exclude some components which do not add much additional information.
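To make the component count and the variance ranking concrete, here is a minimal Python/NumPy sketch of PCA via an eigendecomposition of the correlation matrix. The data are random placeholders with the same shape as a 159 x 30 rating matrix, and `pca_correlation` is a hypothetical helper, not the routine used to produce the results in this report.

```python
import numpy as np

def pca_correlation(X):
    """PCA on the correlation matrix of X (rows = relationships, columns = dimensions)."""
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]         # re-sort so the largest comes first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()       # proportion of variance per component
    return eigvals, eigvecs, explained

# Placeholder data only: 159 "relationships" rated on 30 "dimensions".
rng = np.random.default_rng(0)
X = rng.normal(size=(159, 30))

eigvals, eigvecs, explained = pca_correlation(X)
cumulative = np.cumsum(explained)
print(f"{len(eigvals)} components from {X.shape[1]} input dimensions")
print(f"First four components account for {100 * cumulative[3]:.2f}% of the variance")
```

With real ratings, the cumulative proportion at the fourth component is the quantity reported as the variance explained by the first four components.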
We use parallel analysis to estimate the optimal number of components to retain.
## Parallel analysis suggests that the number of factors = NA and the number of components = 4
Parallel analysis indicates that having 4 components would be optimal.
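The logic of parallel analysis can be sketched as below: observed eigenvalues are compared against eigenvalues from random data of the same shape, and leading components are retained while they exceed the random benchmark. This is a simplified Python/NumPy version of Horn's procedure under stated assumptions (normal noise, a 95th-percentile threshold, 200 simulations). It is not guaranteed to reproduce the output of the software used above, and the synthetic two-factor data are purely illustrative.

```python
import numpy as np

def sorted_eigenvalues(X):
    """Eigenvalues of the correlation matrix of X, largest first."""
    return np.sort(np.linalg.eigvalsh(np.corrcoef(X, rowvar=False)))[::-1]

def parallel_analysis(X, n_iter=200, quantile=0.95, seed=0):
    """Count leading components whose eigenvalue beats the random-data threshold."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    observed = sorted_eigenvalues(X)
    simulated = np.empty((n_iter, p))
    for i in range(n_iter):
        simulated[i] = sorted_eigenvalues(rng.normal(size=(n, p)))
    threshold = np.quantile(simulated, quantile, axis=0)
    exceeds = observed > threshold
    # Index of the first component that fails the test = number retained.
    n_components = p if exceeds.all() else int(np.argmin(exceeds))
    return n_components, observed, threshold

# Synthetic example: 10 variables driven by 2 latent factors plus noise.
rng = np.random.default_rng(1)
factors = rng.normal(size=(300, 2))
weights = np.zeros((2, 10))
weights[0, :5] = 1.0
weights[1, 5:] = 1.0
X = factors @ weights + 0.5 * rng.normal(size=(300, 10))

n_components, observed, threshold = parallel_analysis(X)
print("Components suggested:", n_components)
```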
PCA with no rotation is done here to visualize the amount of variance accounted for by each component.
Rotations are used in principal component analysis to make the components easier to interpret. There are two main families of rotation: orthogonal (e.g., varimax) and oblique (e.g., oblimin). Here, we use varimax rotation, which maximizes the variance of the squared loadings within each component so that each dimension loads strongly on a single component rather than across several. Because varimax is an orthogonal rotation, the resulting components remain uncorrelated with each other. An oblique rotation such as oblimin would instead allow the components to correlate.
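For readers who want to see what a varimax rotation does mechanically, here is a minimal Python/NumPy sketch of the widely used SVD-based iteration. The function names, the iteration limit, the tolerance, and the toy loading matrix are assumptions for illustration; this is not the routine that generated the loadings reported below.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-6):
    """Return (rotated_loadings, rotation) for an orthogonal varimax rotation."""
    rotation = np.eye(loadings.shape[1])
    objective = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the varimax criterion with respect to the rotated loadings.
        gradient = rotated**3 - rotated * (rotated**2).mean(axis=0)
        u, s, vt = np.linalg.svd(loadings.T @ gradient)
        rotation = u @ vt                     # nearest orthogonal matrix
        previous, objective = objective, s.sum()
        if previous != 0 and objective / previous < 1 + tol:
            break
    return loadings @ rotation, rotation

def varimax_criterion(loadings):
    """Sum over components of the variance of the squared loadings."""
    return np.var(loadings**2, axis=0).sum()

# Toy example: a clean two-component structure, deliberately mixed by 30 degrees.
simple = np.array([[0.9, 0.0], [0.8, 0.0], [0.7, 0.0],
                   [0.0, 0.9], [0.0, 0.8], [0.0, 0.7]])
theta = np.deg2rad(30)
mix = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta), np.cos(theta)]])
unrotated = simple @ mix

rotated, rotation = varimax(unrotated)
print(np.round(rotated, 2))
```

The rotation matrix is orthogonal, so each dimension's communality (its row sum of squared loadings) is unchanged; only the distribution of loadings across components shifts.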
## [1] "First four components account for 77.45% of the variance"
## Component 1 highest positive loadings: Activity Intensity, Attachment, Communal Sharing, Endurance, Equality, Intimacy, Love Expression, Mating, Socioemotional, Uniqueness
##
## Component 1 highest negative loadings: Formality and Regulation, Importance for society, Occupational, Service Exchange, Strategic, Visibility
##
## Component 2 highest positive loadings: Activeness, Activity Intensity, Attachment, Importance for individuals involved, Importance for society, Information Exchange, Spatial Distance, Synchronicity, Uniqueness
##
## Component 2 highest negative loadings: none
##
## Component 3 highest positive loadings: Affiliation Coalition, Expected Reciprocity, Love Expression, Valence Evaluation
##
## Component 3 highest negative loadings: Coercion, Conflict
##
## Component 4 highest positive loadings: Concreteness, Goods Exchange, Money Exchange, Negotiation, Service Exchange
##
## Component 4 highest negative loadings: none
PC1 = Formality
PC2 = Activeness
PC3 = Valence
PC4 = Negotiation
Three of these components match those seen in the previous studies. The fourth component, Negotiation, is new in the present study and describes a new region of the feature space.
Here we compare the results of Study 3A, which explored the representational space of 25 relationships on 30 dimensions from the literature, with those of Study 3B, which explored the representational space of 159 relationships on 30 dimensions from the literature. This analysis shows how the social relationship feature space can change depending on which relationships are sampled.
## [1] "Study 3A PC1 is most strongly correlated to Study 3B PC1 (rho = 0.9511, p = 0)"
## [1] "Study 3A PC2 is most strongly correlated to Study 3B PC2 (rho = 0.6578, p = 0.000113818919914616)"
## [1] "Study 3A PC3 is most strongly correlated to Study 3B PC4 (rho = 0.8234, p = 9.59406013092002e-07)"
## [1] "Study 3A PC4 is most strongly correlated to Study 3B PC1 (rho = -0.6565, p = 0.000118651895070458) and to Study 3B PC3 (rho = 0.6436, p = 0.000175618115427518)"
| Component | Study 3A | Study 3B |
|---|---|---|
| PC1 | Formality | Formality |
| PC2 | Activeness | Activeness |
| PC3 | Negotiation | Valence |
| PC4 | Valence | Negotiation |
The component loadings from the two studies are moderately to strongly correlated (|rho| > 0.50). However, there are differences between components that share a name, and the third and fourth components swap order between studies. This suggests that the loadings shift with the set of relationships sampled, possibly because the smaller study lacks relationship variety.
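The loading comparison can be sketched as follows: compute Spearman's rho between every pair of loading columns and, for each component in one study, report the component in the other study with the largest absolute correlation. This is a Python/NumPy illustration with randomly generated placeholder loadings; `average_ranks`, `spearman`, and `best_matches` are hypothetical helper names, and the code is not the analysis script behind the output above.

```python
import numpy as np

def average_ranks(x):
    """Ranks 1..n, with tied values sharing their average rank."""
    x = np.asarray(x, dtype=float)
    ranks = np.empty(len(x))
    ranks[np.argsort(x, kind="mergesort")] = np.arange(1, len(x) + 1)
    for value in np.unique(x):
        tied = x == value
        ranks[tied] = ranks[tied].mean()
    return ranks

def spearman(x, y):
    """Spearman's rho = Pearson correlation of the ranks."""
    return np.corrcoef(average_ranks(x), average_ranks(y))[0, 1]

def best_matches(loadings_a, loadings_b):
    """For each column of A, find the column of B with the largest |rho|."""
    rho = np.array([[spearman(a, b) for b in loadings_b.T] for a in loadings_a.T])
    return rho, np.argmax(np.abs(rho), axis=1)

# Placeholder loadings: 30 dimensions x 4 components per study.
# Study "A" is built from study "B" with components 3 and 4 swapped, plus noise.
rng = np.random.default_rng(0)
loadings_b = rng.normal(size=(30, 4))
loadings_a = loadings_b[:, [0, 1, 3, 2]] + 0.1 * rng.normal(size=(30, 4))

rho, best = best_matches(loadings_a, loadings_b)
for i, j in enumerate(best):
    print(f"A PC{i + 1} is most strongly correlated with B PC{j + 1} (rho = {rho[i, j]:.4f})")
```

Taking the absolute value matters because the sign of a component is arbitrary, so a strong negative correlation can still indicate the same underlying component.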
The correlations between the relationship component scores are very strong, so the "relationship space" of the two studies is very similar.
Note: For both the loading comparison and the relationship score comparison, I used a Spearman correlation to account for differences in the value distributions of the two studies. For example, PC1 scores range from about -20 to 20 in one study but only from about -2 to 2 in the other, even though they are the same component (Formality). Norming the scores before correlating them would not change these results: standardizing is a linear rescaling that preserves rank order, and Spearman's rho depends only on ranks.
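A quick numerical check of the scaling point, written as a Python/NumPy sketch with simulated scores (the ranges mimic the -20 to 20 versus -2 to 2 example; the data and helper names are assumptions, and the simple rank function assumes no tied values):

```python
import numpy as np

def ranks(x):
    """Ranks 1..n, assuming no tied values."""
    return np.argsort(np.argsort(x)) + 1

def spearman(x, y):
    """Spearman's rho = Pearson correlation of the ranks."""
    return np.corrcoef(ranks(x), ranks(y))[0, 1]

def zscore(x):
    """Standardize to mean 0 and standard deviation 1."""
    return (x - x.mean()) / x.std(ddof=1)

# Simulated scores for the same component on two very different scales.
rng = np.random.default_rng(0)
scores_wide = rng.uniform(-20, 20, size=159)
scores_narrow = scores_wide / 10 + rng.normal(scale=0.5, size=159)

raw = spearman(scores_wide, scores_narrow)
normed = spearman(zscore(scores_wide), zscore(scores_narrow))
print(f"rho on raw scores:    {raw:.4f}")
print(f"rho on normed scores: {normed:.4f}")
```

The two values agree because z-scoring never changes which observation ranks above which.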
Next, we take a closer look at whether the same components are highly correlated between studies (e.g., Formality from Study 3A and Formality from Study 3B).
We correlated the relationship scores across all of the components for all of the studies. The results are strongly consistent: a given component from one study (e.g., Valence) is strongly correlated with the same component from another study.